pacman::p_load(sf, raster, spatstat,sparr,tmap,tidyverse,spNetwork,lubridate,gridExtra,readr,reshape2,viridis,gifski,classInt,knitr)Take home exercise 1
Thailand Road Accident Case Study

1 The scene
Overview According to the World Health Organization (WHO), road traffic accidents cause approximately 1.19 million deaths annually, with 20-50 million non-fatal injuries. Vulnerable road users, such as pedestrians, cyclists, and motorcyclists, make up more than half of all deaths. Road traffic injuries are the leading cause of death for children and young adults aged 5–29, with two-thirds of fatalities occurring among working-age individuals (18–59 years). Low- and middle-income countries, which own 60% of the world’s vehicles, account for 90% of road fatalities. Road accidents also impose significant economic burdens, costing countries 3% of their annual GDP.
Thailand has the highest road death rate in Southeast Asia, with around 20,000 deaths annually. From 2014 to 2021, 19% of accidents occurred on national highways, and there was a 66% chance of encountering accident-prone zones (“black spots”), with 66% on straight roads, 13% at curves, and smaller percentages at intersections, bridges, and slopes.
1.1 Objectives
we are tasked to discover factors affecting road traffic accidents in the Bangkok Metropolitan Region BMR by employing both spatial spatio-temporal point patterns analysis methods.
The specific objectives of this take-home exercise are as follows:
To visualize the spatio-temporal dynamics of road traffic accidents in BMR using appropriate statistical graphics and geovisualization methods.
To conduct detailed spatial analysis of road traffic accidents using appropriate Network Spatial Point Patterns Analysis methods.
To conduct detailed spatio-temporal analysis of road traffic accidents using appropriate Temporal Network Spatial Point Patterns Analysis methods.
1.3 The Data
For the purpose of this exercise, three basic data sets are needed, they are:
Thailand Road Accident [2019-2022] on Kaggle,a point feature layer in CSV format: The dataset contains point features representing road accidents in Thailand from 2019 to 2022. Each feature represents a unique accident event, with spatial attributes (latitude and longitude) and other associated accident data, such as the date, time, severity, and road type.
Thailand Roads (OpenStreetMap Export) on HDX,a line feature layer in ESRI Shapefile format: The dataset contains line features representing the road network of Thailand. Each feature represents a section of a road, with spatial attributes and additional road-related information.
Thailand - Subnational Administrative Boundaries on HDX,a polygon feature layer in ESRI Shapefile format.
2 Installing and Loading the R packages
Explanations for the imported library:
sf: For handling geospatial data and geometries.
raster: Deals with raster data and geographic grid layers.
spatstat: Spatial data analysis and point pattern analysis.
sparr: Performs spatial relative risk estimation and kernel density estimation.
tmap: For interactive and static maps.
tidyverse: For data manipulation and visualization.
spNetwork: Network-based spatial analysis.
lubridate: Simplifies working with dates and times.
gridExtra: For arranging and combining plots.
readr: Fast reading of rectangular data.
reshape2: For reshaping data.
viridis: Provides color maps for data visualization.
gifski: For GIF creation.
classInt: For interval classification.
knitr: Creates dynamic reports in R
2.1 Spatial Data Wrangling
2.1.1 Importing & Converting an Aspatial Data
We first select the Bangkok Metropolitan Region (BMR).
selected_provinces <- c("Bangkok", "Nonthaburi", "Pathum Thani", "Samut Prakan", "Nakhon Pathom", "Samut Sakhon","knitr")2.1.2 Data preprocessing and transformation
acci <- read_csv("data/rawdata/thai_road_accident_2019_2022.csv") %>%
mutate(Year = year(incident_datetime)) %>%
mutate(Month_num = month(incident_datetime)) %>%
mutate(Month_fac = month(incident_datetime, label = TRUE, abbr = TRUE)) %>%
mutate(dayofweek = wday(incident_datetime, label = TRUE)) %>%
mutate(dayofyear = yday(incident_datetime)) %>%
mutate(hour = hour(incident_datetime)) %>%
mutate(Month = floor_date(incident_datetime, "month")) %>%
filter(!is.na(longitude) & longitude != "",
!is.na(latitude) & latitude != "",
province_en %in% selected_provinces) %>%
st_as_sf(coords = c("longitude", "latitude"), crs = 4326) %>%
st_transform(crs = 32647) %>%
mutate(Time = as.numeric(difftime(incident_datetime, min(incident_datetime), units = "days")))Rows: 81735 Columns: 18
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (10): province_th, province_en, agency, route, vehicle_type, presumed_c...
dbl (6): acc_code, number_of_vehicles_involved, number_of_fatalities, numb...
dttm (2): incident_datetime, report_datetime
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Following steps:
Load Data: Reads a CSV file containing accident spatial data.
Date-Time Operations: Extracts year, month, day, and hour from the accident’s date and time, as well as adds new columns for day of the week and day of the year.
Filter Data: Removes records with missing or invalid latitude/longitude and limits data to specific provinces.
Spatial Transformation: Converts the accident points into spatial objects with proper CRS and transforms the coordinates.
Add Time Variable: Computes the time difference from the first incident in days.
write_rds(acci,"data/rds/acci.rds")
acci <- read_rds("data/rds/acci.rds")glimpse(acci)Rows: 12,986
Columns: 25
$ acc_code <dbl> 571882, 600001, 605043, 629691, 571887, 59…
$ incident_datetime <dttm> 2019-01-01 02:25:00, 2019-01-01 03:00:00,…
$ report_datetime <dttm> 2019-01-02 17:32:00, 2019-01-05 10:33:00,…
$ province_th <chr> "นครปฐม", "นนทบุรี", "สมุทรปราการ", "กรุงเทพมห…
$ province_en <chr> "Nakhon Pathom", "Nonthaburi", "Samut Prak…
$ agency <chr> "department of rural roads", "department o…
$ route <chr> "แยกทางหลวงหมายเลข 4 (กม.ที่ 51+920) - บ้านวัด…
$ vehicle_type <chr> "motorcycle", "private/passenger car", "pr…
$ presumed_cause <chr> "speeding", "speeding", "running red light…
$ accident_type <chr> "rollover/fallen on straight road", "rollo…
$ number_of_vehicles_involved <dbl> 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 2, …
$ number_of_fatalities <dbl> 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, …
$ number_of_injuries <dbl> 2, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, …
$ weather_condition <chr> "clear", "clear", "clear", "clear", "clear…
$ road_description <chr> "straight road", "straight road", "other",…
$ slope_description <chr> "no slope", "no slope", "other", "other", …
$ Year <dbl> 2019, 2019, 2019, 2019, 2019, 2019, 2019, …
$ Month_num <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ Month_fac <ord> Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Ja…
$ dayofweek <ord> Tue, Tue, Tue, Tue, Tue, Tue, Tue, Tue, Tu…
$ dayofyear <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ hour <int> 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 9, 10, 11, 1…
$ Month <dttm> 2019-01-01, 2019-01-01, 2019-01-01, 2019-…
$ geometry <POINT [m]> POINT (627012.3 1533381), POINT (655…
$ Time <dbl> 0.00000000, 0.02430556, 0.02430556, 0.0277…
Notice: Table above shows the content of acci, a new column called geometry has been added into the data frame. On the other hand, the longitude and latitude columns have been dropped from the data frame.
2.1.3 Mapping the spatial data set
tmap_mode('view')tmap mode set to interactive viewing
tm_shape(acci)+
tm_dots(col = 'red')